Anchor and UBM-based multi-class MLLR m-vector system for speaker verification

Authors

  • Achintya Kumar Sarkar
  • Claude Barras
Abstract

In this paper, we propose two techniques to extend the recently introduced m-vector system for speaker verification, which is based on a global Maximum Likelihood Linear Regression (MLLR) transformation (i.e., super-vector), into a multi-class MLLR m-vector system in the Universal Background Model (UBM) framework. In the first method, the Gaussian mean vectors of the UBM are first grouped into several classes using conventional K-means or a proposed clustering algorithm based on Expectation Maximization (EM) and Maximum Likelihood (ML) concepts. MLLR transformations are then calculated for the given speech data with respect to each class and used in super-vector form to represent the speaker by m-vectors. In the second approach, several MLLR transformations are estimated with respect to pre-defined models called anchors. The proposed systems show better performance than the conventional system. Furthermore, the proposed UBM-based system does not require additional alignment of speech data with respect to the UBM for estimation of multiple MLLR transformations. We also show that the proposed EM & ML clustering algorithm is robust to random initialization and provides system performance equal or comparable to that of K-means. Experimental results are reported on the NIST 2008 SRE core condition over various tasks.
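The first method described in the abstract (group the UBM Gaussian means into classes, estimate one transform per class, stack the transforms into a super-vector) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes identity covariances so that each class transform has a closed-form weighted least-squares solution, it uses plain K-means only (the proposed EM & ML clustering is not shown), and it stops at the super-vector without the m-vector extraction step. All function names, the `ridge` parameter, and the synthetic statistics are hypothetical.

```python
import numpy as np

def kmeans_labels(means, k, iters=20, seed=0):
    """Group Gaussian mean vectors (M, D) into k classes with plain K-means."""
    rng = np.random.default_rng(seed)
    centers = means[rng.choice(len(means), size=k, replace=False)].copy()
    labels = np.zeros(len(means), dtype=int)
    for _ in range(iters):
        # Squared Euclidean distance from every mean to every center.
        dist = ((means[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = dist.argmin(axis=1)
        for c in range(k):
            if np.any(labels == c):
                centers[c] = means[labels == c].mean(axis=0)
    return labels

def class_transform(means_c, occ_c, first_c, ridge=1e-6):
    """Affine transform W = [b | A] for one class.

    ASSUMPTION: identity covariances, so the ML estimate reduces to
    W = (sum_m f_m xi_m^T) (sum_m n_m xi_m xi_m^T)^-1 with xi_m = [1, mu_m],
    n_m the occupancy and f_m the first-order statistic of Gaussian m.
    """
    xi = np.hstack([np.ones((len(means_c), 1)), means_c])            # (Mc, D+1)
    G = (xi * occ_c[:, None]).T @ xi + ridge * np.eye(xi.shape[1])   # (D+1, D+1)
    K = first_c.T @ xi                                               # (D, D+1)
    return np.linalg.solve(G, K.T).T                                 # (D, D+1)

def multiclass_supervector(means, occ, first, labels, k):
    """Stack the vectorized per-class transforms into one super-vector."""
    parts = [
        class_transform(means[labels == c], occ[labels == c], first[labels == c]).ravel()
        for c in range(k)
    ]
    return np.concatenate(parts)

# Synthetic stand-ins for a UBM and for statistics of one utterance.
rng = np.random.default_rng(1)
M, D, k = 64, 3, 2
ubm_means = rng.normal(size=(M, D))
A_true = np.eye(D) + 0.1 * rng.normal(size=(D, D))
b_true = rng.normal(size=D)
occ = np.full(M, 10.0)                                   # occupancies n_m
first = occ[:, None] * (ubm_means @ A_true.T + b_true)   # first-order stats f_m

labels = kmeans_labels(ubm_means, k)
supervector = multiclass_supervector(ubm_means, occ, first, labels, k)
W_global = class_transform(ubm_means, occ, first)        # single global transform
```

Because the occupancies and first-order statistics are accumulated once per Gaussian, the per-class transforms in this sketch reuse the same statistics and only differ in which Gaussians they sum over, which is one way to read the abstract's remark that no additional alignment against the UBM is needed.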


Similar resources

Multiple background models for speaker verification using the concept of vocal tract length and MLLR super-vector

In this paper, we investigate the use of Multiple Background Models (M-BMs) in Speaker Verification (SV). We cluster the speakers using either their Vocal Tract Lengths (VTLs) or their speaker-specific Maximum Likelihood Linear Regression (MLLR) super-vector, and build a separate Background Model (BM) for each such cluster. We show that the use of M-BMs provides improved performance whe...


Sub-vector Extraction and Cascade Post-Processing for Speaker Verification Using MLLR Super-vectors

In this paper, we propose a speaker-verification system based on maximum likelihood linear regression (MLLR) super-vectors, for which speakers are characterized by m-vectors. These vectors are obtained by a uniform segmentation of the speaker MLLR super-vector using an overlapped sliding window. We consider three approaches for MLLR transformation, based on the conventional 1-best automatic tra...


A new kernel for SVM MLLR based speaker recognition

Speaker recognition using support vector machines (SVMs) with features derived from generative models has been shown to perform well. Typically, a universal background model (UBM) is adapted to each utterance yielding a set of features that are used in an SVM. We consider the case where the UBM is a Gaussian mixture model (GMM), and maximum likelihood linear regression (MLLR) adaptation is used...


Investigation of Speaker-Clustered UBMs based on Vocal Tract Lengths and MLLR matrices for Speaker Verification

It is common to use a single speaker-independent large Gaussian Mixture Model based Universal Background Model (GMM-UBM) as the alternative hypothesis for speaker verification tasks. The speaker models are themselves derived from the UBM using the Maximum a Posteriori (MAP) adaptation technique. During verification, the log likelihood ratio is calculated between the target model and the GMM-UBM to accep...


Fast computation of speaker characterization vector using MLLR and sufficient statistics in anchor model framework

Anchor modeling technique has been shown to be useful in reducing computational complexity for speaker identification and indexing of large audio database. In this technique, speakers are projected onto a talker space spanned by a set of predefined anchor models which are usually represented by Gaussian Mixture Models (GMMs). The characterization of each speaker involves calculation of likeliho...




Publication date: 2013